Fast Principal Component Analysis of Large-Scale Genome-Wide Data
Similar resources
Fast Principal Component Analysis of Large-Scale Genome-Wide Data
Principal component analysis (PCA) is routinely used to analyze genome-wide single-nucleotide polymorphism (SNP) data, for detecting population structure and potential outliers. However, the size of SNP datasets has increased immensely in recent years and PCA of large datasets has become a time-consuming task. We have developed flashpca, a highly efficient PCA implementation based on randomized...
Large-Scale Principal Component Analysis on LiveJournal Friends Network
Principal Component Analysis (PCA) is a general means of unsupervised exploration that can be used to find basic motives and organizational themes, the guidance in friends network formation. The applications of PCA include Kleinberg’s ranking algorithm as well as spectral graph partitioning. We extend the applicability of PCA to very large scale social networks by handling the abundance of smal...
Fast Large-scale Mixture Modeling with Component-specific Data Partitions
Remarkably easy implementation and guaranteed convergence have made the EM algorithm one of the most used algorithms for mixture modeling. On the downside, the E-step is linear in both the sample size and the number of mixture components, making it impractical for large-scale data. Based on the variational EM framework, we propose a fast alternative that uses component-specific data partitions t...
Large-Scale Sparse Principal Component Analysis with Application to Text Data
Sparse PCA provides a linear combination of small number of features that maximizes variance across data. Although Sparse PCA has apparent advantages compared to PCA, such as better interpretability, it is generally thought to be computationally much more expensive. In this paper, we demonstrate the surprising fact that sparse PCA can be easier than PCA in practice, and that it can be reliably ...
Fast Iterative Kernel Principal Component Analysis
We develop gain adaptation methods that improve convergence of the Kernel Hebbian Algorithm (KHA) for iterative kernel PCA (Kim et al., 2005). KHA has a scalar gain parameter which is either held constant or decreased according to a predetermined annealing schedule, leading to slow convergence. We accelerate it by incorporating the reciprocal of the current estimated eigenvalues as part of a ga...
Journal
Journal title: PLoS ONE
Year: 2014
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0093766